Supplementary Material
All code can be downloaded from https://github.com/Shanka123/OCRA.
Figure S1: Abstract Reasoning Tasks (ART). Same/different: Two objects are presented, and the task is to say whether they are the same or different. Relational match-to-sample: A source pair of objects is presented that either instantiates a 'same' or 'different' relation, and the task is to select the pair of target objects (out of two pairs) that instantiates the same relation.
[...] in a 2×2 array format, with the source pair presented in the top [...] of [...] The task is to select the missing object from a set of four choices. Problems were presented in a 2×3 array format, with one of the answer choices inserted into the bottom right cell (separate images for each answer choice, see Figure S8). Identity rules: An abstract pattern is instantiated in the first row (ABA, ABB, or AAA), and the [...] instantiated in the second row.
Gradient Flossing: Improving Gradient Descent through Dynamic Control of Jacobians
Training recurrent neural networks (RNNs) remains a challenge due to the instability of gradients across long time horizons, which can lead to exploding and vanishing gradients. Recent research has linked these problems to the values of the Lyapunov exponents of the forward dynamics, which describe the growth or shrinkage of infinitesimal perturbations. Here, we propose gradient flossing, a novel approach to tackling gradient instability by pushing Lyapunov exponents of the forward dynamics toward zero during learning. We achieve this by regularizing Lyapunov exponents through backpropagation using differentiable linear algebra. This enables us to "floss" the gradients, stabilizing them and thus improving network training.
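As a rough illustration of the quantity being regularized, the sketch below estimates the leading Lyapunov exponents of a toy autonomous RNN with repeated QR decompositions of the propagated Jacobians, and evaluates a sum-of-squares penalty that is zero only when those exponents are zero. The update rule `h_{t+1} = tanh(W h_t)`, the function names `lyapunov_exponents` and `flossing_penalty`, and the exact form of the penalty are assumptions made for illustration, not the authors' implementation. The sketch uses NumPy and therefore only evaluates the penalty; backpropagating through it would require writing the same steps in an autodiff framework with a differentiable QR.

```python
import numpy as np


def lyapunov_exponents(W, h0, n_steps, k):
    """Estimate the first k Lyapunov exponents of h_{t+1} = tanh(W h_t).

    Assumes J_t @ Q has full column rank at every step, so the diagonal
    of R is nonzero and its logarithm is finite.
    """
    n = W.shape[0]
    h = np.array(h0, dtype=float)
    Q = np.eye(n)[:, :k]          # orthonormal basis of k tangent directions
    log_r = np.zeros(k)
    for _ in range(n_steps):
        h = np.tanh(W @ h)
        # Jacobian of the update: diag(1 - h_{t+1}^2) @ W
        J = (1.0 - h ** 2)[:, None] * W
        # Propagate the basis and re-orthonormalize it
        Q, R = np.linalg.qr(J @ Q)
        # |diag(R)| holds the per-direction growth factors for this step
        log_r += np.log(np.abs(np.diag(R)))
    return log_r / n_steps


def flossing_penalty(exponents):
    """Hypothetical penalty: sum of squared exponents (zero iff all are zero)."""
    return float(np.sum(np.asarray(exponents) ** 2))


# Toy usage with a random recurrent matrix (all values here are illustrative).
rng = np.random.default_rng(0)
n = 32
W = 1.5 * rng.standard_normal((n, n)) / np.sqrt(n)
lam = lyapunov_exponents(W, rng.standard_normal(n), n_steps=500, k=4)
print("leading exponents:", lam)
print("penalty:", flossing_penalty(lam))
```

Tracking only `k` directions keeps each QR step cheap relative to a full spectrum; how many exponents to regularize is a design choice this sketch leaves open.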
Architecture
In this section, we provide comprehensive details about the Transformer model architectures considered in this work. We implement all models in PyTorch [61] and adapt the implementation of Transformer-XL from VPT [4].
A.1 Observation Encoding
Experiments conducted on both DMLab and RoboMimic include RGB image observations. For models trained on DMLab, we use a ConvNet [29] similar to the one used in Espeholt et al. [20]. For models trained on RoboMimic, we follow Mandlekar et al. [53] to use a ResNet-18 network [29] followed by a spatial-softmax layer [23].
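For readers unfamiliar with the spatial-softmax layer mentioned above, the following is a minimal NumPy sketch of the commonly used formulation: each channel of a feature map is turned into a probability distribution over pixel locations, and the layer outputs the expected (x, y) coordinate per channel. The function name `spatial_softmax`, the `temperature` parameter, and the choice of normalizing coordinates to [-1, 1] are assumptions for illustration; the models described here are implemented in PyTorch and may differ in these details.

```python
import numpy as np


def spatial_softmax(features, temperature=1.0):
    """Map a (C, H, W) feature map to (C, 2) expected (x, y) keypoints.

    Coordinates are normalized to [-1, 1] (an assumption of this sketch).
    """
    c, h, w = features.shape
    flat = features.reshape(c, h * w) / temperature
    # Subtract the per-channel max for numerical stability
    flat = flat - flat.max(axis=1, keepdims=True)
    probs = np.exp(flat)
    probs = probs / probs.sum(axis=1, keepdims=True)
    probs = probs.reshape(c, h, w)

    xs = np.linspace(-1.0, 1.0, w)
    ys = np.linspace(-1.0, 1.0, h)
    # Marginalize over rows to get p(x), over columns to get p(y)
    expected_x = (probs.sum(axis=1) * xs).sum(axis=1)
    expected_y = (probs.sum(axis=2) * ys).sum(axis=1)
    return np.stack([expected_x, expected_y], axis=1)


# Toy usage: random activations standing in for a ResNet-18 feature map.
rng = np.random.default_rng(0)
keypoints = spatial_softmax(rng.standard_normal((8, 7, 7)))
print(keypoints.shape)
```

The output has two numbers per channel regardless of spatial resolution, which is why this layer is often used to compress convolutional features into a compact vector before a sequence model.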